Abstract

We address the problem of spherical deconvolution in a nonparametric
statistical framework, where both the signal and the operator kernel
are subject to measurement errors. After a preliminary treatment of
the kernel, we apply a thresholding procedure to the signal in a
second generation wavelet basis. Under standard assumptions on the
kernel, we study the theoretical performance of the resulting
algorithm in terms of $L^p$ losses ($p \ge 1$) on Besov spaces on the
sphere. We thereby extend the application of second generation
spherical wavelets to the blind deconvolution framework [16]. The
procedure is furthermore adaptive with regard both to the sparsity and
smoothness of the target function, and to the blurring effect of the
kernel. We end with the study of a concrete example, highlighting the
improvement of our procedure over the recent blockwise-SVD algorithm
[6].
second generation wavelets; nonparametric adaptive estimation; linear in-

1 Introduction



1.1 Statistical framework 



Consider the following problem: we aim at recovering a signal
$f \in L^2(\mathbb{S}^2)$. The function $f$ is not observed directly,
but through the action of a blurring process modeled by a linear
operator $K$. To this end, we consider the classic white noise model,
where the available information is the noisy version

$g_\varepsilon = Kf + \varepsilon W$    (1.1)

of $f$, where $W$ is a white noise on $\mathbb{S}^2$ and
$K : L^2(\mathbb{S}^2) \to L^2(\mathbb{S}^2)$ is a measurable operator.
We further restrict the shape of $K$ by assuming that $K$ is a
convolution operator on $L^2(\mathbb{S}^2)$, a classic framework ([13],
[17] and [16]) enjoying convenient mathematical properties (see
Section 1.2). This model is equivalently formulated in a density
estimation framework, in which one aims at recovering the density $f$
of a random variable $X$ on $\mathbb{S}^2$ from an $n$-sample
$(\theta_1 X_1, \dots, \theta_n X_n)$ of $Z = \theta X$ (with the
analogy $\varepsilon \sim n^{-1/2}$), where $\theta$ is a random element
of the group $SO_3$ with density $h_\theta$, and $Z$ has a density
$f_Z \in L^2(\mathbb{S}^2)$. In practice, the blurring operator $K$ is
seldom directly observable and is itself subject to measurement errors.
This covers the cases where either $K$ is unknown but approximated via
preliminary inference, or $K$ is known but always observed with noise
for experimental reasons. The result is a noisy version $K_\delta$,
satisfying

$K_\delta = K + \delta B$    (1.2)

where $B$ is a Gaussian white noise on $L^2(SO_3)$, independent of $W$.
The relevance of this generic setting was discussed in Efromovich and
Koltchinskii [8] and Hoffmann and Reiß [14], and covers numerous fields
of application. Let us mention, for example, image processing, a field
which covers astronomy as well as electron microscopy, where an image,
identified with a function $f \in L^2([0,1]^2)$, is observed through
its convolution with the Point Spread Function of the measuring device,
which hence needs to be estimated first (see [23], [1]).
For $u, u', v, v', w, w' \in L^2(\mathbb{S}^2)$, observable quantities
obtained from (1.1) and (1.2) hence take the form
$\langle Kf, u\rangle + \varepsilon\alpha(u)$ (signal) and
$\langle Kv, w\rangle + \delta\beta(v,w)$ (operator), where
$\alpha(u) \sim \mathcal{N}(0, \|u\|_2^2)$,
$\beta(v,w) \sim \mathcal{N}(0, \|v\|_2^2\|w\|_2^2)$,
$\mathbb{E}[\alpha(u)\alpha(u')] = \langle u, u'\rangle_{L^2(\mathbb{S}^2)}$ and
$\mathbb{E}[\beta(v,w)\beta(v',w')] = \langle v, v'\rangle_{L^2(\mathbb{S}^2)}\langle w, w'\rangle_{L^2(\mathbb{S}^2)}$.

As we stated, we deal with a convolution on the 2-dimensional sphere.
Namely, if $\theta$ admits a density $h$ on $SO_3$ with respect to the
Haar measure, then $Kf$ has the following expression

$Kf(\omega) = \int_{SO_3} f(g^{-1}\omega)\,h(g)\,dg$    (1.3)

where $dg$ is the Haar measure on $SO_3$. That is, $f$ is averaged on a
neighbourhood of $\omega$ with weight $h(g)$ for each rotation $g^{-1}$
applied to $\omega$. This problem, together with the introduction of
needlets, is for example well illustrated by the study of ultra high
energy cosmic rays (UHECR). A UHECR is a radiation hitting the earth
with very high energy, whose physical origin is still unknown. Yet
understanding the mechanisms at work in this phenomenon is fundamental.
Current hypotheses involve pulsars, hypernovae or black holes. Robust
statistical tools are heavily required in order to properly estimate
the shape of the density of the radiation, which is closely related to
the physical processes at stake in its formation. One could ask, for
example, whether the density is uniform over the sphere, indicating a
cosmological cause, or whether it is a superposition of localized
spikes. In the latter case, it is crucial to determine precisely the
positions of these spikes. In practice however, observations
$(X_1, \dots, X_n)$ of such radiations are often subject to various
physical perturbations, translated through the impulse response of the
measuring device. We model these by a random rotation $\theta$, which
is to say we actually observe realisations
$(\theta_1 X_1, \dots, \theta_n X_n)$ of the random variable
$Z = \theta X$. The difficulty of the problem is characterized by the
spreading of $h_\theta$ around the identity: the less localized it is,
the more difficult the estimation of $f$ should be. Moreover, the law
of $\theta$ is not known in general, even if some assumptions can
restrict its shape. In this case, preliminary inference is necessary,
and leads to an estimator $K_\delta$ of $K$.

Case of a known operator 

We shall concentrate here on the case where $\delta = 0$, and describe
the path which finally led to the introduction and use of needlets in
this setting. Spherical harmonics constitute the most natural set of
functions to expand a target function $f \in L^2(\mathbb{S}^2)$, and
present a structure highly compatible with deconvolution problems. This
prompted Healy et al. [13] to solve the deconvolution problem with
them, thereby reaching optimal $L^2$ rates of convergence (Kim and Koo
[17]). Unfortunately their performance can prove quite poor in general
cases, since they lack localization in the spatial domain (see [12]).
More recently, spherical wavelets were introduced (Schröder and
Sweldens [25] and Narcowich and Ward [19]) and have found various
applications for a direct estimation of $f$, including geophysics and
atmospheric sciences (see for example Freeden and Schreiner [9] or
Freeden and Michel [10]). However, these wavelets, which rely on a
spatial construction, have an infinite support in the frequency domain,
and hence are not suited for the case of spherical deconvolution,
unlike spherical harmonics. The solution to this problem was brought by
Narcowich and Ward [19], who introduced a new set of functions, called
needlets, which preserve the frequency localization of spherical
harmonics as well as the compatibility with inverse problems, while
remedying their lack of spatial localization. Since then, needlets have
become widely used in astrophysics (Marinucci et al. [18] or Guilloux
et al. [12]) or brain shape modeling (Tournier et al. [26]). In
particular, Kerkyacharian et al. [16] reached near-minimax rates of
convergence for $L^p$ losses ($1 < p < \infty$) in the present
spherical deconvolution setting.

Resolution when K is unknown : Galerkin projection 

In the case of an unknown operator $K$, the main methods involve SVD,
WVD and Galerkin schemes (see [3], [4], [14] for example). We now give
an overview of the so-called Galerkin method and present its
application to blind deconvolution. It is based on a discretization of
(1.1) and (1.2) through the choice of appropriate test functions.
Suppose we want to recover $f$ from the observation $g = Kf$. Let
$X, Y \subset L^2(\mathbb{S}^2)$ be two finite-dimensional subspaces
which admit the respective orthonormal bases
$\varphi = (\varphi_k)_{k=1,\dots,n}$ and $\psi = (\psi_k)_{k=1,\dots,n}$.
The Galerkin approximation $f_n \in X$ of $f$ is the solution of the
equation

$\langle Kf_n, v\rangle = \langle g, v\rangle \quad \forall v \in Y$
$\iff \sum_{k \le n} \langle K\varphi_k, \psi_{k'}\rangle\langle f_n, \varphi_k\rangle = \langle g, \psi_{k'}\rangle \quad \forall k' \le n$    (1.4)

$f_n$ is easily computable, as the equivalent solution of the
finite-dimensional linear system $g_n = K_n f_n$, where $g_n$ is the
vector whose components are $(\langle g, \psi_k\rangle)_{k \le n}$ and
$K_n$ the matrix with entries
$(\langle K\varphi_k, \psi_{k'}\rangle)_{k,k' \le n}$. Hence, this method
relies on the discretization of the operator $K$, together with the
discretization of the function $f$.
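The finite-dimensional system $g_n = K_n f_n$ can be sketched numerically as follows. This is only an illustration: the matrix `K_n` and the coefficient vector `f_true` are hypothetical toy values, not quantities from the paper, and the row/column convention (row $k'$, column $k$ holds $\langle K\varphi_k, \psi_{k'}\rangle$) is an assumption.

```python
import numpy as np

def galerkin_solve(K_n, g_n):
    """Solve the Galerkin system K_n f_n = g_n for the coefficients of f_n.

    K_n[k', k] is assumed to hold <K phi_k, psi_k'> and g_n[k'] = <g, psi_k'>.
    """
    return np.linalg.solve(K_n, g_n)

# Hypothetical, well-conditioned discretized operator (toy values).
K_n = np.array([[2.0, 0.1, 0.0],
                [0.1, 1.0, 0.1],
                [0.0, 0.1, 0.5]])
f_true = np.array([1.0, -0.5, 0.25])   # hypothetical coefficients <f, phi_k>
g_n = K_n @ f_true                     # noise-free observation g = K f
f_n = galerkin_solve(K_n, g_n)
```

In the noise-free case the coefficients are recovered exactly; with noisy $g$ and $K$ the same solve becomes unstable when $K_n$ is badly conditioned, which is what the regularization steps discussed next are meant to control.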

Galerkin projections were successfully applied to blind deconvolution
to reach optimal rates of convergence on generic Hilbert spaces ([8])
or on Besov spaces through wavelet-thresholding techniques ([14], [4]).
It remains to handle two practical problems. First, the algorithm must
include and articulate two essential steps, namely the inversion of $K$
and the regularization of the data through a projection/thresholding
scheme. Note that both the signal and the operator $K$ can be subject
to regularization (see [14], [6]).

The second practical problem lies in choosing the right functions
$\varphi_k$ and $\psi_k$. This choice should ideally resolve the dilemma
of finding a set which is compatible both with the representation of
$f$ (through its membership of a certain set of functions) and with the
structure of $K$ (see [14], [6]). Spherical harmonics respond optimally
to the problem in the case of spherical deconvolution on Sobolev spaces
for an $L^2$ error, since they realize a blockwise-SVD decomposition of
$K$, as shown in Proposition 1.1. More importantly here, when $\delta$
is positive, they allow a fine treatment of $K_\delta$ thanks to the
sparse structure of the original operator $K$, which allowed Delattre
et al. [6] to reach optimal $L^2$-rates of convergence for a natural
class of operators and functions. Thus, we should always seek to
preserve this property of sparsity whenever possible.

1.2 Harmonic analysis on $SO_3$ and $\mathbb{S}^2$

The next part provides preliminary tools in order to apply a blockwise
scheme to the case of spherical deconvolution. It is a quick overview
of harmonic analysis on the spaces $\mathbb{S}^2$ and $SO_3$, mostly
inspired by Healy et al. [13]. Let us define the Euler matrices

$u(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}$,  $a(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}$

where $\varphi \in [0, 2\pi)$, $\theta \in [0, \pi)$.

Every rotation $g$ in $SO_3$ is the product of three elementary
rotations:

$g = u(\varphi)\,a(\theta)\,u(\psi)$    (1.5)

where $\varphi, \psi \in [0, 2\pi)$, $\theta \in [0, \pi)$ are the Euler
angles of $g$. Let $l \in \mathbb{N}$ and $-l \le m, n \le l$. We also
define the rotational harmonics

$R^l_{mn}(\varphi, \theta, \psi) = e^{-i(m\varphi + n\psi)} P^l_{mn}(\cos\theta)$    (1.6)

where $P^l_{mn}$ are the second type Legendre functions.
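The Euler decomposition (1.5) can be checked numerically. The sketch below assumes the two elementary rotations are the usual rotations about the $z$-axis and the $y$-axis; the function names are illustrative choices only.

```python
import numpy as np

def u(phi):
    """Rotation of angle phi about the z-axis."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def a(theta):
    """Rotation of angle theta about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rotation(phi, theta, psi):
    """Element of SO(3) with Euler angles (phi, theta, psi)."""
    return u(phi) @ a(theta) @ u(psi)

phi, theta, psi = 0.3, 1.1, 2.0
g = rotation(phi, theta, psi)
north_pole = np.array([0.0, 0.0, 1.0])
image = g @ north_pole   # point of the sphere with coordinates (theta, phi)
```

The image of the north pole under $g$ is the point of the sphere with spherical coordinates $(\theta, \varphi)$, whatever the value of $\psi$, which is the usual link between the two parametrizations.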

The functions $R^l_{mn}$, $l \in \mathbb{N}$, $|m|, |n| \le l$ are the
eigenfunctions of the Laplace-Beltrami operator on $SO_3$, associated
with the eigenvalues $2l + 1$. Therefore, the system
$(\sqrt{2l+1}\,R^l_{mn})_{l,m,n}$ forms a complete orthonormal basis of
$L^2(SO_3)$. Let $h \in L^2(SO_3)$. For all $l \ge 0$, the projection of
$h$ on the space of rotational harmonics with degree $l$ is

$\sum_{m,n=-l}^{l} h^l_{mn} R^l_{mn}$

where $h^l_{mn}$ is the $(l,m,n)$ Fourier coefficient of $h$, defined by

$h^l_{mn} = \int_{SO_3} h(g)\,R^l_{mn}(g)\,dg$    (1.7)
An analogous study is available on $\mathbb{S}^2$. Any point
$\omega \in \mathbb{S}^2$ is determined by its spherical coordinates
$(\theta, \varphi)$:

$\omega = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$    (1.8)

where $\theta \in [0, \pi)$ and $\varphi \in [0, 2\pi)$. Let $l$ be a
nonnegative integer and $m, n$ two integers ranging from $-l$ to $l$.
Define the following functions, known as the spherical harmonics, on
$\mathbb{S}^2$:

$Y^l_m(\theta, \varphi) = (-1)^m \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;P^l_m(\cos\theta)\,e^{im\varphi}$    (1.9)

where $P^l_m$ are the Legendre functions. The set
$(Y^l_m)_{l \ge 0, |m| \le l}$ constitutes an orthonormal basis of
$L^2(\mathbb{S}^2)$. Denote by $\mathbb{H}_l$ the space of spherical
harmonics of degree $l$ and by $P_l$ the orthogonal projector onto
$\mathbb{H}_l$. For every function $f \in L^2(\mathbb{S}^2)$,

$P_l f = \sum_{m=-l}^{l} f^l_m Y^l_m$

where $f^l_m$ is the $(l,m)$ Fourier coefficient of $f$, defined by

$f^l_m = \int_{\mathbb{S}^2} f(\omega)\,\overline{Y^l_m(\omega)}\,d\omega$

The term blockwise-SVD finds its roots in the following proposition,
which expresses the link between the Fourier coefficients of $h * f$
and those of $h$ and $f$. A proof can be found in [13].

Proposition 1.1 (Blockwise property). Let $h \in L^2(SO_3)$ and
$f \in L^2(\mathbb{S}^2)$. The Fourier coefficients of $h * f$ are

$(h * f)^l_m = \sum_{n=-l}^{l} h^l_{mn} f^l_n = \sum_{n=-l}^{l} \langle h * Y^l_n, Y^l_m\rangle f^l_n$

Hence, if $K$ is a convolution operator over $L^2(\mathbb{S}^2)$ and
$f \in L^2(\mathbb{S}^2)$, and if we denote by $K^l$ the matrix
$(\langle KY^l_n, Y^l_m\rangle)_{|m|,|n| \le l} \in \mathcal{M}_{2l+1}(\mathbb{C})$
and by $f^l$ the vector $(\langle f, Y^l_m\rangle)_{|m| \le l}$,
Proposition 1.1 translates into $(Kf)^l = K^l f^l$. Turning back to the
Galerkin projection of $K$, take
$\varphi = \psi = (Y^l_m)_{l \ge 0, |m| \le l}$. Proposition 1.1 actually
implies that the Galerkin matrix
$(\langle KY^{l_1}_{m_1}, Y^{l_2}_{m_2}\rangle)_{l_i \ge 0,\,|m_i| \le l_i,\,i=1,2}$
is sparse, with the blocks $K^l$ on its diagonal. This justifies the
name blockwise-SVD decomposition. In the sequel, if
$f \in L^2(\mathbb{S}^2)$ and
$K : L^2(\mathbb{S}^2) \to L^2(\mathbb{S}^2)$ is a convolution operator,
we will refer indifferently to $P_l f$ or $f^l$, and to $P_l K P_l$ or
$K^l$. Besides, due to Parseval's formula, we also have
$\|P_l f\|_{L^2(\mathbb{S}^2)} = \|f^l\|_{\ell^2(\mathbb{C}^{2l+1})}$ and
$\|P_l K P_l\|_{L^2(\mathbb{S}^2) \to L^2(\mathbb{S}^2)} = \|K^l\|_{\ell^2(\mathbb{C}^{2l+1}) \to \ell^2(\mathbb{C}^{2l+1})}$.
Turning back to the original problem and recalling Proposition 1.1, we
can reformulate the equivalent problem, obtained by projecting (1.1)
and (1.2) on every space $\mathbb{H}_l$:

$\forall l \ge 0, \quad g^l_\varepsilon = K^l f^l + \varepsilon W^l$    (1.10)

$\forall l \ge 0, \quad K^l_\delta = K^l + \delta B^l$    (1.11)

where $W^l$ is a centered Gaussian vector with covariance $I_{2l+1}$,
and $B^l$ is a $(2l+1) \times (2l+1)$ matrix whose entries are i.i.d.
variables with common law $\mathcal{N}(0,1)$.
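The level-by-level model (1.10)-(1.11) is straightforward to simulate. The sketch below is a toy illustration under stated assumptions: the block $K^l$ is taken real and diagonal with entries $(1+l)^{-\nu}$, and $f^l$ is a vector of ones, both of which are hypothetical choices made only for the example.

```python
import numpy as np

def simulate_level(K_l, f_l, eps, delta, rng):
    """Draw one observation (g^l_eps, K^l_delta) of the blockwise model."""
    d = K_l.shape[0]                       # d = 2l + 1
    W_l = rng.standard_normal(d)           # signal noise, covariance I_d
    B_l = rng.standard_normal((d, d))      # operator noise, i.i.d. N(0, 1)
    g_l = K_l @ f_l + eps * W_l
    K_l_delta = K_l + delta * B_l
    return g_l, K_l_delta

rng = np.random.default_rng(0)
l, nu = 3, 1.0
d = 2 * l + 1
K_l = np.eye(d) * (1.0 + l) ** (-nu)       # hypothetical block of K
f_l = np.ones(d)                           # hypothetical coefficients of f
g_l, K_l_delta = simulate_level(K_l, f_l, eps=1e-3, delta=1e-3, rng=rng)
f_naive = np.linalg.solve(K_l_delta, g_l)  # plug-in inversion, no thresholding
```

The plug-in inversion on the last line is the unregularized estimate; its error grows with $l$ as the blocks of $K$ shrink, which motivates the thresholding of both the operator and the signal.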

As we said, spherical harmonics however show serious drawbacks when
used in the estimation of a generic function
$f \in L^2(\mathbb{S}^2)$. We now turn to the presentation of functions
far better suited to this end.



2 Needlets 

2.1 Construction of Needlets 

Needlets were introduced in Narcowich et al. [20] , and used in the frame- 
work of density estimation on the sphere by Kerkyacharian and Picard 
[15], Baldi et al. [2] and Kerkyacharian et al. [16]. As their construction 
relies on a rearrangement of spherical harmonics, they inherit the very 
useful stability properties of the latter in inverse problems (as expressed 
in 1.1). In addition, whereas spherical harmonics' supports spread all 
over the sphere, needlets are almost exponentially localized around their 
respective centers, thus allowing a fine multi-resolution analysis and a 
description of very general regularity spaces on S 2 . 



Needlet framework 



As we have seen, the following decomposition holds:

$L^2(\mathbb{S}^2) = \bigoplus_{l=0}^{\infty} \mathbb{H}_l$

The orthogonal projector $P_l$ on $\mathbb{H}_l$ can be written

$P_l(f)(x) = \int_{\mathbb{S}^2} L_l(\langle x, y\rangle) f(y)\,dy = \int_{\mathbb{S}^2} \sum_{m=-l}^{l} Y^l_m(x)\overline{Y^l_m(y)} f(y)\,dy$

where $L_l$ is the Legendre polynomial of degree $l$, and
$\langle \cdot, \cdot\rangle$ stands for the usual scalar product on
$\mathbb{R}^3$. Finally, the fact that $P_l$ is a projector implies the
identity

$\int_{\mathbb{S}^2} L_l(\langle x, y\rangle) L_k(\langle y, z\rangle)\,dy = \delta_{l,k} L_l(\langle x, z\rangle)$    (2.1)



Littlewood-Paley decomposition 

Let $a$ be a $C^\infty(\mathbb{R})$ symmetric function, compactly
supported in $[-1, 1]$, decreasing on $\mathbb{R}_+$, such that for all
$x \in \mathbb{R}$, $0 \le a(x) \le 1$ and for all $|x| \le 1/2$,
$a(x) = 1$. Define, for all $x \in \mathbb{R}$,

$b^2(x) = a(x/2) - a(x)$

$b^2$ is a nonnegative function, supported in
$[-2, -1/2] \cup [1/2, 2]$, satisfying by construction

$\forall |x| \ge 1, \quad \sum_{j \ge 0} b^2(x/2^j) = 1$    (2.2)
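One possible way to build such a window is sketched below. The particular choice of $a$, based on the classical smooth function $t \mapsto e^{-1/t}$, is an assumption made for illustration; any $C^\infty$ function with the stated properties would do. The snippet then checks the partition of unity (2.2) numerically.

```python
import numpy as np

def _smooth(t):
    """exp(-1/t) for t > 0 and 0 otherwise: a standard C-infinity building block."""
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)            # avoid dividing by zero
    return np.where(t > 0, np.exp(-1.0 / safe), 0.0)

def a(x):
    """Symmetric window: 1 on [-1/2, 1/2], 0 outside [-1, 1], decreasing on R+."""
    r = np.abs(np.asarray(x, dtype=float))
    num = _smooth(1.0 - r)
    return num / (num + _smooth(r - 0.5))

def b2(x):
    """b^2(x) = a(x/2) - a(x), supported in [1/2, 2] in absolute value."""
    x = np.asarray(x, dtype=float)
    return a(x / 2.0) - a(x)

x = np.array([1.0, 1.7, 5.0, 37.3])
partition = sum(b2(x / 2.0 ** j) for j in range(12))   # telescoping sum
```

The sum telescopes to $a(x/2^{J+1}) - a(x)$, which equals 1 as soon as $|x| \ge 1$ and $J$ is large enough that $|x|/2^{J+1} \le 1/2$.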

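Define the kernels

$\Lambda_j(x, y) = \sum_{l \ge 0} b^2(l/2^j)\,L_l(\langle x, y\rangle)$  and  $M_j(x, y) = \sum_{l \ge 0} b(l/2^j)\,L_l(\langle x, y\rangle)$    (2.3)

and the associated operators

$B_j f = \int_{\mathbb{S}^2} \Lambda_j(\cdot, y) f(y)\,dy$  and  $A_J f = \sum_{j=-1}^{J} B_j f$    (2.4)

with the convention $B_{-1} f = P_0 f$. Note that the two sums in (2.3)
are finite since $b(l/2^j) = 0$ if $l \notin L_j$, where $L_j$ is the
set of integers between $2^{j-1}$ and $2^{j+1} - 1$. It is
straightforward to show that, for all $f \in L^2(\mathbb{S}^2)$,

$\|f\|_2^2 = \sum_{j} \sum_{\eta \in Z_j} |\langle f, \psi_{j,\eta}\rangle|^2$    (2.5)

with the needlets $\psi_{j,\eta}$ and the sets $Z_j$ defined in the next
paragraph. One of the main results in Narcowich et al. [21] is that
$A_J$ also mimics the best polynomial approximation of $f$ with respect
to $\|\cdot\|_p$ for all $p \ge 1$, as expressed in the following
theorem:

Theorem 2.1. For all $p \in [1, \infty[$, if $f \in L^p(\mathbb{S}^2)$,
then

$\|A_J f - f\|_p \to 0$ as $J \to \infty$,

with uniform convergence if $f \in C^0(\mathbb{S}^2)$.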
Space discretization

Proposition 2.2 (Quadrature formula). Denote by $\mathcal{P}_l$ the set
of polynomials of degree less than $l$ on $\mathbb{S}^2$. For each
$l \ge 0$, there exist a finite set $Z_l$ of cubature points and
nonnegative reals $(\lambda_\eta)_{\eta \in Z_l}$ such that

$\forall f \in \mathcal{P}_l, \quad \int_{\mathbb{S}^2} f(\omega)\,d\omega = \sum_{\eta \in Z_l} \lambda_\eta f(\eta)$

Since $b(l/2^j) \ne 0$ only if $2^{j-1} < l < 2^{j+1}$, the function
$z \mapsto M_j(x, z)$ belongs to $\mathcal{P}_{2^{j+1}-1}$, and
$z \mapsto M_j(x, z) M_j(z, y)$ is an element of
$\mathcal{P}_{2^{j+2}-2}$. For convenience, we will write
$Z_{2^{j+2}-2} = Z_j$. Hence, $B_j$ can be written

$B_j(f)(x) = \int_{\mathbb{S}^2} \Big( \int_{\mathbb{S}^2} M_j(x, z) M_j(z, y)\,dz \Big) f(y)\,dy$
$\qquad\quad = \sum_{\eta \in Z_j} \sqrt{\lambda_\eta}\,M_j(x, \eta) \int_{\mathbb{S}^2} \sqrt{\lambda_\eta}\,M_j(\eta, y) f(y)\,dy$

The functions $\psi_{j,\eta} = \sqrt{\lambda_\eta}\,M_j(\cdot, \eta)$ are
called needlets. Furthermore, it can be proved that the cubature points
$\eta$ and weights $\lambda_\eta$ can be chosen so that the two
following conditions are verified,

$c^{-1} 2^{2j} \le \mathrm{card}(Z_j) \le c\,2^{2j}$  and  $c^{-1} 2^{-2j} \le \lambda_\eta \le c\,2^{-2j}$    (2.6)

with a constant $c > 0$.
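A needlet depends on $x$ only through $t = \langle x, \eta\rangle$, so it can be evaluated as a Legendre series. The sketch below assumes the normalization $L_l(t) = \frac{2l+1}{4\pi} P_l(t)$ of the projection kernel, a window $a$ built from $e^{-1/t}$, and a hypothetical cubature weight `lam_eta`; none of these specific choices is taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre

def _smooth(t):
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)
    return np.where(t > 0, np.exp(-1.0 / safe), 0.0)

def a(x):
    """Smooth window equal to 1 on [-1/2, 1/2] and 0 outside [-1, 1]."""
    r = np.abs(np.asarray(x, dtype=float))
    num = _smooth(1.0 - r)
    return num / (num + _smooth(r - 0.5))

def b(x):
    """Square root of b^2(x) = a(x/2) - a(x)."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(a(x / 2.0) - a(x), 0.0, None))

def needlet(j, lam_eta, t):
    """Value of psi_{j,eta}(x) as a function of t = <x, eta>."""
    l = np.arange(2 ** (j + 1) + 1)
    coeffs = b(l / 2.0 ** j) * (2 * l + 1) / (4.0 * np.pi)
    # legval evaluates sum_l coeffs[l] * P_l(t)
    return np.sqrt(lam_eta) * legendre.legval(t, coeffs)

j = 3
lam_eta = 4.0 * np.pi / 2 ** (2 * j)       # hypothetical weight of order 2^{-2j}
at_center = needlet(j, lam_eta, 1.0)       # x = eta
at_equator = needlet(j, lam_eta, 0.0)      # x orthogonal to eta
```

The value at the center is the sum of nonnegative terms, while away from it the Legendre polynomials oscillate and cancel, which is the mechanism behind the spatial localization stated below.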

2.2 Besov spaces 
Properties of needlets 

By construction, needlets are well localized in frequency ($C^\infty$,
compactly supported). A crucial result proved by Narcowich et al. [21]
shows that they are furthermore near exponentially localized in space.

Theorem 2.3. Let $j \ge 0$, $\eta \in Z_j$. For all $M > 0$, there
exists $C_M > 0$ such that

$\forall x \in \mathbb{S}^2, \quad |\psi_{j,\eta}(x)| \le \frac{C_M\,2^j}{(1 + 2^j d(x, \eta))^M}$    (2.7)

where $d(x, y) = \arccos(\langle x, y\rangle)$ is the geodesic distance
on the sphere. To illustrate this point, we represent two needlets of
level $j = 2, 3$ in Figure 1. The following properties are all
consequences of this localization property.

Figure 1: A spherical representation of two needlets (level $j = 2, 3$
from left to right) centered around the point $(0, 0, 1)$. The darkened
zones correspond to the regions where the needlet takes large values.

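Proposition 2.4. For all $p \ge 1$ (with the convention
$1/\infty = 0$), there exist $c_p, C_p, D_p > 0$ such that

$c_p\,2^{2j(1/2 - 1/p)} \le \|\psi_{j,\eta}\|_p \le C_p\,2^{2j(1/2 - 1/p)}$    (2.8)

Proposition 2.5. For all $p \in [1, +\infty]$, there exists a constant
$C_p$ such that for all $f \in L^p(\mathbb{S}^2)$,

$\|B_j(f)\|_p \le C_p\,\big\|(|\langle f, \psi_{j,\eta}\rangle|\,\|\psi_{j,\eta}\|_p)_{\eta \in Z_j}\big\|_{\ell_p}$    (2.9)

Moreover,

$\big\|(|\langle f, \psi_{j,\eta}\rangle|\,\|\psi_{j,\eta}\|_p)_{\eta \in Z_j}\big\|_{\ell_p} \le \|f\|_p$    (2.10)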

Construction of Besov spaces 

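Besov spaces on the sphere naturally generalize the usual approximation
properties of regular functions, while being simply characterized with
the help of needlets. A complete description, and the proofs of the
results claimed in this part, can be found in Narcowich et al. [20] or
Kerkyacharian and Picard [15]. Let
$f : \mathbb{S}^2 \to \mathbb{R}$ be a measurable function and let
$E_{k,\pi}$ ($\pi \ge 1$) be the distance of $f$ to $\mathcal{P}_k$ with
respect to $\|\cdot\|_{L^\pi}$, that is

$E_{k,\pi} = \inf_{P \in \mathcal{P}_k} \|f - P\|_{L^\pi}$

Theorem 2.6. Let $0 < s < \infty$, $1 \le \pi \le \infty$ and
$0 < r \le \infty$. Let $f \in L^\pi$. The following statements are
equivalent and define the Besov space $B^s_{\pi,r}$.

$\Big(\sum_{k \ge 1} (k^s E_{k,\pi})^r\,\frac{1}{k}\Big)^{1/r} < \infty$    (2.11)

$\Big(\sum_{j \ge 0} (2^{js} E_{2^j,\pi})^r\Big)^{1/r} < \infty$    (2.12)

$\exists (\varepsilon_j)_j \in \ell_r(\mathbb{N}), \quad \|B_j f\|_\pi = \varepsilon_j 2^{-js}$    (2.13)

$\exists (\varepsilon_j)_j \in \ell_r(\mathbb{N}), \quad \Big(\sum_{\eta \in Z_j} |\langle f, \psi_{j,\eta}\rangle|^\pi \|\psi_{j,\eta}\|_\pi^\pi\Big)^{1/\pi} = \varepsilon_j 2^{-js}$    (2.14)

$B^s_{\pi,r}$ is a Banach space, associated with the norm

$\|f\|_{B^s_{\pi,r}} = \Big\| \Big(2^{j(s + 2(1/2 - 1/\pi))}\,\|(\langle f, \psi_{j,\eta}\rangle)_{\eta \in Z_j}\|_{\ell_\pi}\Big)_{j} \Big\|_{\ell_r}$

Besov spaces satisfy the following embeddings, all of which derive from
Hölder's inequality.

Proposition 2.7 (Besov embeddings). Let $s > 0$,
$1 \le p, \pi, r \le \infty$.

• $B^s_{\pi,r} \subset B^s_{p,r}$ if $\pi \ge p$.

• $B^s_{\pi,r} \subset B^{s - 2(1/\pi - 1/p)}_{p,r}$ if $\pi \le p$ and $s - 2(1/\pi - 1/p) > 0$.

• $B^s_{\pi,r} \subset C^0(\mathbb{S}^2)$ if $s > 2/\pi$, where
$C^0(\mathbb{S}^2)$ is the set of continuous functions on
$\mathbb{S}^2$.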

3 Estimation procedure 

We turn to the presentation of our procedure of Blind Deconvolution
using Needlets (BND) and derive rates of convergence for generic $L^p$
losses on Besov spaces. A natural idea would be to take needlets as
test functions in equation (1.4), since they represent $f$ efficiently.
Unfortunately the ensuing Galerkin matrix
$(\langle K\psi_{j,\eta}, \psi_{h,\alpha}\rangle)_{(j,\eta),(h,\alpha)}$
has many non-zero entries, due to the fact that the frequency levels of
$\psi_{j,\eta}$ and $\psi_{h,\alpha}$ overlap if $|j - h| \le 1$. The
choice of the spherical harmonics is far more suitable; moreover the
ensuing matrices $K^l$ enter naturally in the needlet decomposition of
$g_\varepsilon$, since Parseval's formula decomposes
$\langle g_\varepsilon, \psi_{j,\eta}\rangle$ over the levels
$l \in L_j$.
Before entering into the details of the procedure, we need to quantify
the blurring effect of $K$ with the introduction of a constant $\nu$
called the degree of ill-posedness (DIP):

Assumption 3.1 (Spectral behaviour of $K$). There exist $\nu \ge 0$,
$Q_1(K), Q_2(K) > 0$ such that, for all $l \in \mathbb{N}^*$,

$Q_1 l^\nu \le \|(K^l)^{-1}\|_{op} \le Q_2 l^\nu$    (3.1)

We denote by $\mathcal{K}_\nu(Q_1, Q_2)$ the set of operators
satisfying this assumption.

Assumption 3.1 actually states that even if $K$ is $L^2$ continuous,
its inverse is not bounded and hence not computable in a satisfying
way, but the weaker assumption that
$K : W^{-\nu/2} \to W^{\nu/2}$ is continuous holds (see [22]).

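Let us now give an intuition of the procedure. Decomposing the inner
product $\langle Kf, \psi_{j,\eta}\rangle$, $j \ge 0$, $\eta \in Z_j$,
on every space $\mathbb{H}_l$, $l \ge 0$, via Parseval's formula,
coupled with Proposition 1.1, entails

$\langle f, \psi_{j,\eta}\rangle = \sum_{l \in L_j} \langle (K^l)^{-1}(Kf)^l, \psi^l_{j,\eta}\rangle$

Hence a first natural estimator of $\langle f, \psi_{j,\eta}\rangle$
would be

$\sum_{l \in L_j} \langle (K^l_\delta)^{-1} g^l_\varepsilon, \psi^l_{j,\eta}\rangle$

Remark that the elements $\psi^l_{j,\eta}$, $l \in L_j$, are easily
computable thanks to the identity

$(\psi_{j,\eta})^l_m = \sqrt{\lambda_\eta}\,b(l/2^j)\,\overline{Y^l_m(\eta)}$ for all $l \in L_j$, $|m| \le l$

However, the presence of noise on both the signal and the operator
requires an additional treatment. This is realized through a
preliminary processing $T_{op}(K^l_\delta)$ of $K^l_\delta$ and a
secondary treatment
$T_{sig}\big(\sum_{l \in L_j} \langle (T_{op}(K^l_\delta))^{-1} g^l_\varepsilon, \psi^l_{j,\eta}\rangle\big)$
of the resulting estimator.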

3.1 Main procedure 

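Suppose that Assumption 3.1 holds. Define $J$, the maximal resolution
level, such that

$2^J = \Lambda \big\lfloor (\varepsilon\sqrt{|\log\varepsilon|})^{-1} \wedge (\delta\sqrt{|\log\delta|})^{-2} \big\rfloor$    (3.2)

for a positive parameter $\Lambda$. For $j \in \mathbb{N}$, define

$l_j = \min\{l \in L_j,\ \|T_{op}(K^l_\delta)\|_{op} \ne 0\}$

(with the convention $\min\emptyset = +\infty$, and with $T_{op}$ as
defined below), and, for positive constants $\kappa$, $\tau_{sig}$ and
$\tau_{op}$,

$O_{l,\delta} = \kappa\sqrt{2l+1}\;\delta\sqrt{|\log\delta|}$    (3.3)

$S_j(\delta, \varepsilon) = \|T_{op}(K^{l_j}_\delta)^{-1}\|_{op}\,\big(\tau_{sig}\,\varepsilon\sqrt{|\log\varepsilon|} \vee \tau_{op}\,2^{j/2}\delta\sqrt{|\log\delta|}\big)$ if $l_j < \infty$,
$S_j(\delta, \varepsilon) = +\infty$ if $l_j = +\infty$    (3.4)

Define also the ensuing regularizing procedures $T_{sig}$ and
$T_{op}$, inspired by [6] and [16]:

$\forall g \in L^2(\mathbb{S}^2), \quad T_{sig}(g) = \sum_{j=0}^{J} \sum_{\eta \in Z_j} \langle g, \psi_{j,\eta}\rangle\,\mathbf{1}_{\{|\langle g, \psi_{j,\eta}\rangle| > S_j(\delta,\varepsilon)\}}\,\psi_{j,\eta}$

$\forall K \in L^2(SO_3), \quad T_{op}(K) = \sum_{l=0}^{2^{J+1}} K^l\,\mathbf{1}_{\{\|(K^l)^{-1}\|_{op} \le O_{l,\delta}^{-1}\}}$

The estimator $\hat f$ of $f$ is defined by

$\hat f = T_{sig}\big(T_{op}(K_\delta)^{-1} g_\varepsilon\big) = \sum_{j \le J} \sum_{\eta \in Z_j} \hat\beta_{j,\eta}\,\mathbf{1}_{\{|\hat\beta_{j,\eta}| > S_j(\delta,\varepsilon)\}}\,\psi_{j,\eta}$

where we denote

$\hat\beta_{j,\eta} = \sum_{l=2^{j-1}}^{2^{j+1}} \big\langle (K^l_\delta)^{-1}\,\mathbf{1}_{\{\|(K^l_\delta)^{-1}\|_{op} \le O_{l,\delta}^{-1}\}}\,g^l_\varepsilon,\ \psi^l_{j,\eta} \big\rangle$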
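The two thresholding steps can be sketched on finite-dimensional blocks. The code below is an illustration only, written under the assumption that the operator cut-off is $O_{l,\delta} = \kappa\sqrt{2l+1}\,\delta\sqrt{|\log\delta|}$ and that the signal threshold is $\|T_{op}(K^{l_j}_\delta)^{-1}\|_{op}(\tau_{sig}\,\varepsilon\sqrt{|\log\varepsilon|} \vee \tau_{op}\,2^{j/2}\delta\sqrt{|\log\delta|})$; the function names are hypothetical.

```python
import numpy as np

def operator_threshold(K_l_delta, delta, kappa):
    """Return inv(K^l_delta) if its operator norm is at most 1/O_{l,delta}, else None."""
    d = K_l_delta.shape[0]                                   # d = 2l + 1
    O = kappa * np.sqrt(d) * delta * np.sqrt(abs(np.log(delta)))
    try:
        inv = np.linalg.inv(K_l_delta)
    except np.linalg.LinAlgError:
        return None                                          # singular block: level killed
    return inv if np.linalg.norm(inv, 2) <= 1.0 / O else None

def signal_threshold(inv_norm, j, eps, delta, tau_sig, tau_op):
    """Threshold level S_j(delta, eps), given the norm of the retained inverse block."""
    sig = tau_sig * eps * np.sqrt(abs(np.log(eps)))
    op = tau_op * 2.0 ** (j / 2.0) * delta * np.sqrt(abs(np.log(delta)))
    return inv_norm * max(sig, op)

def hard_threshold(beta_hat, S):
    """Keep the coefficients whose modulus exceeds S, set the others to zero."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    return np.where(np.abs(beta_hat) > S, beta_hat, 0.0)

kept = operator_threshold(np.eye(3), delta=1e-3, kappa=0.8)
killed = operator_threshold(1e-4 * np.eye(3), delta=1e-3, kappa=0.8)
S = signal_threshold(1.0, j=2, eps=1e-3, delta=1e-3, tau_sig=0.9, tau_op=0.2)
coeffs = hard_threshold([0.5, -2.0, 0.1], 1.0)
```

A block is discarded when its inverse is too large to be distinguished from noise amplification; the surviving inverse norm then sets the scale of the coefficient threshold at level $j$.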



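Theorem 3.2. Let $\pi \ge 1$, $s > 2/\pi$, $r \ge 1$ and $M > 0$. Let
$\nu \ge 0$, let $Q_1 > Q_2 > 0$. Then for sufficiently large $\kappa$
and $\tau$, for all $p \in [1, +\infty[$,

$\sup_{f \in B^s_{\pi,r}(M),\ K \in \mathcal{K}_\nu(Q_1, Q_2)} \mathbb{E}\|\hat f - f\|_p^p \lesssim (|\log\varepsilon|)^{p-1}(\varepsilon\sqrt{|\log\varepsilon|})^{p\mu(2)} \vee (|\log\delta|)^{p-1}(\delta\sqrt{|\log\delta|})^{p\mu(1)}$    (3.5)

where $\lesssim$ means inequality up to a multiplicative constant
depending only on $p, s, \pi, r, M, \nu, Q_1, Q_2, \Lambda, \kappa,
\tau_{sig}$ and $\tau_{op}$, and where the exponents $\mu(d)$ are
defined for $d \in \mathbb{N}$ by

$\mu(d) = \frac{s}{s + \nu + d/2}$ if $s > (\nu + \frac{d}{2})(\frac{p}{\pi} - 1)$, or $s = (\nu + \frac{d}{2})(\frac{p}{\pi} - 1)$ and $r \le \pi$,

$\mu(d) = \frac{s - 2/\pi + 2/p}{s - 2/\pi + \nu + d/2}$ if $\frac{2}{\pi} < s < (\nu + \frac{d}{2})(\frac{p}{\pi} - 1)$.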

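Theorem 3.3. Under the same hypotheses as in Theorem 3.2,

$\sup_{f \in B^s_{\pi,r}(M),\ K \in \mathcal{K}_\nu(Q_1, Q_2)} \mathbb{E}\|\hat f - f\|_\infty \lesssim \sqrt{|\log\varepsilon|}\,(\varepsilon\sqrt{|\log\varepsilon|})^{\mu'(2)} \vee \sqrt{|\log\delta|}\,(\delta\sqrt{|\log\delta|})^{\mu'(1)}$    (3.6)

where the exponents $\mu'(d)$ are defined by

$\mu'(d) = \frac{s - 2/\pi}{s - 2/\pi + \nu + d/2}$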
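To get a feeling for these exponents, they can be tabulated with a few lines of code. The sketch assumes the usual form of the rates in this setting: $\mu(d) = s/(s + \nu + d/2)$ in the regular zone $s > (\nu + d/2)(p/\pi - 1)$ (or on its boundary when $r \le \pi$), $\mu(d) = (s - 2/\pi + 2/p)/(s - 2/\pi + \nu + d/2)$ in the sparse zone, and $\mu'(d) = (s - 2/\pi)/(s - 2/\pi + \nu + d/2)$ for the sup-norm loss. The function names are illustrative.

```python
def mu(d, s, nu, p, pi, r=None):
    """Exponent mu(d) for the L^p loss (regular or sparse zone)."""
    boundary = (nu + d / 2.0) * (p / pi - 1.0)
    regular = s > boundary or (s == boundary and r is not None and r <= pi)
    if regular:
        return s / (s + nu + d / 2.0)
    return (s - 2.0 / pi + 2.0 / p) / (s - 2.0 / pi + nu + d / 2.0)

def mu_prime(d, s, nu, pi):
    """Exponent mu'(d) for the L^infinity loss."""
    return (s - 2.0 / pi) / (s - 2.0 / pi + nu + d / 2.0)

# Signal-noise exponent (d = 2) and operator-noise exponent (d = 1), regular zone.
eps_exponent = mu(2, s=2.0, nu=1.0, p=2.0, pi=2.0)
delta_exponent = mu(1, s=2.0, nu=1.0, p=2.0, pi=2.0)
sparse_exponent = mu(2, s=1.5, nu=1.0, p=8.0, pi=2.0)
sup_exponent = mu_prime(2, s=2.0, nu=1.0, pi=2.0)
```

With $s = 2$, $\nu = 1$ and $p = \pi = 2$, the operator-noise exponent is larger than the signal-noise exponent, so for $\delta$ and $\varepsilon$ of the same order the $\varepsilon$-term dominates the bound.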



An explanation of the shape of the thresholding procedure is necessary
here. The term $\|T_{op}(K^{l_j}_\delta)^{-1}\|_{op}$ is meant to
replace the classical term $2^{j\nu}$ (see [16]). Indeed, Lemmas 4.1
and 4.2 show that, with high probability, this term behaves as
$2^{j\nu}$. The procedure BND is hence adaptive for a wide range of
$L^p$ losses and over a wide range of function and operator spaces,
with respect to $s, \pi, r, Q_1, Q_2$ and $\nu$.

What can we say about the case where $\nu$ is already known or
inferred? First, in that case, we can directly replace
$\|T_{op}(K^{l_j}_\delta)^{-1}\|_{op}$ by $2^{j\nu}$ in the threshold
level (3.4). Secondly, the lower bound in Assumption 3.1 becomes
unnecessary (i.e. we can set $Q_2 = 0$) and the class of operators for
which the rates of Theorems 3.2 and 3.3 are available hence becomes
wider. Finally, we can use a sharper maximal level

$2^J = \Lambda \big\lfloor (\varepsilon\sqrt{|\log\varepsilon|})^{-1/(\nu+1)} \wedge (\delta\sqrt{|\log\delta|})^{-1/(\nu+1/2)} \big\rfloor$

which will lead to the same rates of convergence, while avoiding
unnecessary calculations. This is a non-negligible gain, since needlets
are costly with regard to computation time.

Although we chose to work in a white noise model for the convenience of
calculations, the algorithm and ensuing results should transfer easily
to the density estimation framework, in which one observes direct
realizations $(\theta_1 X_1, \dots, \theta_n X_n)$ of $\theta X$ and a
noisy version $K_\delta$ of $K$. More generally, the presence of a
blockwise-SVD decomposition, combined with properties of the ensuing
needlet frame similar to those of Section 2, ensures the applicability
of the scheme with adapted convergence rates. This includes in
particular the corresponding one-dimensional problem (equivalent to
deconvolution in a periodic setting), where the rates improve on those
of Cavalier and Raimondo [4] and Hoffmann and Reiß [14]. Another
practically relevant example concerns the operators defined on
$\mathbb{S}^d$, $d > 1$ via

$Kf(x) = \int_{\mathbb{S}^d} \varphi(\langle x, y\rangle) f(y)\,dy$

where $\varphi$ is a bounded integrable function on $[-1, 1]$. In this
case, as shown by the Funk-Hecke theorem (see [11]), spherical
harmonics realize a SVD of $K$. On the other hand, the construction of
needlets generalizes naturally to $\mathbb{S}^d$ ([21]), and the rates
derived hence change to

$(|\log\varepsilon|)^{p-1}(\varepsilon\sqrt{|\log\varepsilon|})^{p\mu(d)} \vee (|\log\delta|)^{p-1}(\delta\sqrt{|\log\delta|})^{p\mu(0)}$

This sheds light on the role of the dimensional factors obtained in the
rates, the term $\mu(d)$ accounting for the dimension of the underlying
space while the term $\mu(0)$ concerns the efficiency of the set of
functions chosen for the Galerkin projection, via the size of the
blocks obtained.
The speed of convergence gives an explicit interplay between $\delta$
and $\varepsilon$, including the possible case where
$\delta \gg \varepsilon$. If $\delta = 0$, the rates coincide with the
results of Kerkyacharian et al. [16] (actually, the algorithms
themselves are nearly identical), which are optimal in the minimax
sense (up to a log factor, see Willer [27] for a sketch of proof). The
two regions $s \ge (\nu + 1)(\frac{p}{\pi} - 1)$ and
$s < (\nu + 1)(\frac{p}{\pi} - 1)$ are classic in nonparametric
estimation and respectively referred to as the regular case and the
sparse case.

The optimality (in a minimax sense) of the procedure is beyond the
scope of this paper. We do not know if the $\delta$-rate is minimax in
general, though it is trivially the case if the $\varepsilon$-term
dominates the $\delta$-term in Theorems 3.2 and 3.3. Let us point out
that Delattre et al. [6] attained a faster (and optimal) rate in the
particular case where $p = \pi = r = 2$. However, the corresponding
framework was more restrictive since it (crucially) requires the set of
inequalities (3.7), which unilaterally entail Assumption 3.1. Secondly,
the procedure developed therein relies strongly on the convenience of
spherical harmonics to represent both the operator and the signal
sparsely, and is not directly transferable to the present setting
without additional restrictions on the behaviour of $K$ (a direct
transcription of the algorithm actually shows very poor practical
performances).
3.2 Practical study 

We present the practical numerical performances of BND and compare it
to the Blind Blockwise Deconvolution algorithm (BBD) of Delattre et al.
[6]. The sets of cubature points in the simulations that follow have
been taken from the web site of R. Womersley
http://web.maths.unsw.edu.au/~rsw. We proceed with the following
choices of parameters:
Data: the target density $f$ is given by

$f(\omega) = \exp(-2\|\omega - \omega_1\|_{\ell^1(\mathbb{R}^3)})/c$

with $\omega_1 = (0, 1, 0)$ and $c = 0.6729$. Concerning the operator
$K$, we choose it among the class of Rosenthal laws on $SO_3$. These
distributions find their origins in random walks on groups (see [24]).
$K$ is said to follow a Rosenthal distribution of parameters
$a \in\,]0; \pi]$ and $\nu > 0$ on $SO_3$ if, for $l \ge 0$,
$|m| \le l$, we have



A Rosenthal law hence provides a concrete example of operator with DIP
$\nu > 0$. We will take $a = \pi$ and $\nu = 1$.

Tuning parameters: we set $\Lambda = 1$ in (3.2). The concrete choice
of adequate thresholding constants $\kappa$ and $\tau$ is a complex
issue. Our practical choices are based on the following remark,
inspired by Donoho and Johnstone [7]: in the case of direct estimation
on the real line, the universal threshold, which is both efficient and
simple to implement, takes the form $2\sqrt{|\log\varepsilon|}$. A
consistent interpretation is to consider that this threshold should
kill any pure noise signal. We will adapt this reasoning to the case of
interest.

Choice of $\kappa$: we use as a benchmark the case where $K^l$ is the
null matrix of $\mathcal{M}_{2l+1}(\mathbb{R})$ for $l \ge 1$ (this
corresponds to the case where the law of $\theta$ is uniform over
$SO_3$). Given $\delta$ large enough, we retain the smallest value of
$\kappa$ such that, in the Fourier basis, the number of remaining
levels $l \le 10$ is zero. The results are reported in Table 1 and give
$\kappa = 0.8$.
    $\kappa$       0.3   0.4   0.5   0.6   0.7   0.8
    $N_{T_{op}}$   10    9     9     8     2     0

Table 1: Choice of $\kappa$. $N_{T_{op}}$ is the average number,
computed over $N = 10$ realizations, of levels $l \le 10$ such that
$T_{op}(K_\delta)^l \ne 0$. We have $\delta = 10^{-3}$.

Choice of $\tau_{sig}$ and $\tau_{op}$: the role of $\tau_{sig}$ and
$\tau_{op}$ is to control the influence of the signal (resp. the
operator) error. To choose $\tau_{sig}$ (resp. $\tau_{op}$), we
therefore choose $\varepsilon_{sig} > \delta_{sig} > 0$ (resp.
$\delta_{op} > \varepsilon_{op} > 0$) large enough. Following
Kerkyacharian et al. [16], we use the uniform density $u$ on
$\mathbb{S}^2$ as a benchmark. We have
$\langle u, \psi_{j,\eta}\rangle = 0$ for $j \ge 1$, $\eta \in Z_j$;
consequently the observations
$\langle g_{\varepsilon_{sig}}, \psi_{j,\eta}\rangle$, $j \ge 1$, are
pure noise. We hence simulate $K_{\delta_{sig}}$ and, plugging in the
previously computed value of $\kappa$, apply the procedure for
increasing values of $\tau_{sig}$ (resp. $\tau_{op}$) until all the
computed coefficients $\langle u, \psi_{j,\eta}\rangle$ are killed for
$j \le 3$. The results are reported in Table 2 and give
$\tau_{sig} = 0.9$, $\tau_{op} = 0.2$.

               $\tau_{sig}$                        $\tau_{op}$
    $j$      0.5   0.6   0.7   0.8   0.9         0.1   0.2
    0        0     0     0     0     0           0     0
    1        10    6     0     0     0           0     0
    2        20    9     2     1     0           4     0
    3        94    22    8     4     0           127   0

Table 2: Choice of $\tau$. For
$(\delta_{sig}, \varepsilon_{sig}) = (\varepsilon_{op}, \delta_{op}) = (10^{-4}, 10^{-3})$
and each value of $\tau$, we ran the described procedure 10 times and
reported the average number of remaining needlet coefficients at level
$j$.

We compare the performances of BBD (with parameters taken from [6]) and
BND for $\delta \in \{3 \cdot 10^{-3}, 10^{-3}, 10^{-4}\}$ and
$\varepsilon \in \{10^{-3}, 10^{-4}\}$. We perform the algorithm and
run a Monte Carlo method over $N = 20$ simulations in order to
determine the mean squared error and the mean $L^\infty$ error, each of
which is approximated by its discrete equivalent calculated from a
uniform grid of 4096 points on $\mathbb{S}^2$ at each step. Results are
reported in Table 3 and confirm the shape of the obtained rates: BND
clearly outperforms BBD in every situation except when the operator
noise is highly predominant
($(\delta, \varepsilon) = (3 \cdot 10^{-3}, 10^{-4})$). This was to be
expected since $K$ also verifies (3.7), so that the rates of [6] are
available.

                                     $\mathbb{E}\|\hat f - f\|_2$      $\mathbb{E}\|\hat f - f\|_\infty$
    $\delta$          $\varepsilon$   BBD      BND                      BBD      BND
    $3\cdot10^{-3}$   $10^{-3}$       0.2214   0.1018                   0.3877   0.3457
    $3\cdot10^{-3}$   $10^{-4}$       0.1691   0.1606                   0.2155   0.3377
    $10^{-3}$         $10^{-3}$       0.2202   0.1268                   0.3846   0.2268
    $10^{-3}$         $10^{-4}$       0.0834   0.0595                   0.1926   0.1572
    $10^{-4}$         $10^{-3}$       0.2231   0.1257                   0.3925   0.2237
    $10^{-4}$         $10^{-4}$       0.0824   0.0584                   0.1924   0.1568

Table 3: Average normalized $L^2$ and $L^\infty$ loss of BBD and BND.

For particular realizations of $g_\varepsilon$ and $K_\delta$, we plot
in Figure 2 the original shape of the density and the results of the
different algorithms, in the form of spherical views seen 'from above'.
The figures show the better adaptivity of BND to the 'spiky' shape of
the target density.
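The discrete approximation of the losses on a grid can be sketched as follows. The grid construction (a Fibonacci lattice) and the quadrature rule (equal weights $4\pi/N$) are assumptions made for this illustration; the text only specifies a uniform grid of 4096 points, and the normalization used in Table 3 is not reproduced here.

```python
import numpy as np

def fibonacci_sphere(n):
    """Quasi-uniform grid of n points on the unit sphere (hypothetical grid choice)."""
    k = np.arange(n) + 0.5
    z = 1.0 - 2.0 * k / n
    phi = np.pi * (1.0 + 5.0 ** 0.5) * k
    r = np.sqrt(1.0 - z * z)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def discrete_losses(f_vals, f_hat_vals):
    """Approximate the L^2 and L^infinity distances from values on the grid."""
    diff = np.asarray(f_hat_vals, dtype=float) - np.asarray(f_vals, dtype=float)
    n = diff.size
    l2 = np.sqrt(4.0 * np.pi / n * np.sum(diff ** 2))   # equal-weight quadrature
    linf = np.max(np.abs(diff))
    return l2, linf

grid = fibonacci_sphere(4096)
f_vals = np.ones(len(grid))            # toy target: constant function
f_hat_vals = f_vals + 0.1              # toy estimate with a constant bias
l2, linf = discrete_losses(f_vals, f_hat_vals)
```

Averaging these two quantities over independent simulations of the noise gives Monte Carlo estimates of the mean losses.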



4 Proof of theorems 3.2 and 3.3 



Preliminary lemmas 

We first establish deviation bounds on the variables $|\hat\beta_{j,\eta} - \beta_{j,\eta}|$, which 
will be useful later. We begin with the following lemma, which concerns 
the deviations of $\|B^l\|_{op}$. A reference is Davidson and Szarek [5]. 

Lemma 4.1. There exist $\beta_0$ and $c_0$, independent of $l \in \mathbb{N}$, such that 

$$\forall t > \beta_0, \quad \mathbb{P}\big((2l+1)^{-1/2}\|B^l\|_{op} > t\big) \leq \exp\big(-c_0 t^2 (2l+1)^2\big).$$

A simple corollary is the following bound on the moments of $\|B^l\|_{op}$: 

$$\mathbb{E}\big[\|B^l\|_{op}^p\big] \lesssim l^{p/2}.$$



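Lemma 4.2. We further introduce the event $\{\|\delta B^l\|_{op} \leq a_l\}$ with 
$a_l = \rho\, O_{l,\delta}$ for some $0 < \rho < \frac{1}{2}$. On $A_l = \{\|(K^l_\delta)^{-1}\|_{op} \leq O_{l,\delta}^{-1}\}$ and 
$\{\|\delta B^l\|_{op} \leq a_l\}$, since $a_l$ satisfies $O_{l,\delta}^{-1} a_l = \rho < \frac{1}{2}$, by a usual Neumann 
series argument (see Delattre et al. [6]), 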

\\(K i 8 y% p < -^-ii^rXp 

and iKK^n^^a-p)- 1 !!^)- 1 !!^ 

[Figure 2: Spherical view from above of the results of the two algorithms with noise 
level $\delta = 10^{-3}$. Panels: (a) target density; (b) BBD, $\varepsilon = 10^{-3}$; (c) BND, $\varepsilon = 10^{-3}$; 
(d) BBD, $\varepsilon = 10^{-4}$; (e) BND, $\varepsilon = 10^{-4}$. Axis label: Oy.] 

Lemma 4.3. Let $\overline{S}_j(\delta,\varepsilon) = \tau 2^{j\nu}\big(\varepsilon\sqrt{|\log\varepsilon|} \vee 2^{-j/2}\delta\sqrt{|\log\delta|}\big)$ with $\tau = 
\tau_{sig} \vee \tau_{op}$. In the setting of Theorem 3.2, for all $j < J$, $\eta \in Z_j$, for all 
$p > 1$ 

P(|3 M -/3 jV ,|>^,e))<e K2 V^ 2 (4.1) 
nhv ~ Pj,v\ P l £ V V I^Pl^j (4.2) 

E[sup |3 M - < (j + lf[(e2^r V (4^(-V2))P v ^ . . > 

(4.3) 

2 

w/iere 2- 20 ~ S 2v + 1 . 
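As a rough illustration of the thresholding level appearing in Lemma 4.3, read here as $\overline{S}_j(\delta,\varepsilon) = \tau 2^{j\nu}(\varepsilon\sqrt{|\log\varepsilon|} \vee 2^{-j/2}\delta\sqrt{|\log\delta|})$, the following Python sketch computes the level and applies hard thresholding to a list of coefficients. The function names, the numerical values of $\nu$ and $\tau$, and the toy coefficients are hypothetical and serve only to show how the signal-noise and operator-noise parts compete as the level $j$ grows.

```python
import math


def threshold_level(j, delta, eps, nu, tau):
    """tau * 2^(j*nu) * max(eps*sqrt(|log eps|), 2^(-j/2)*delta*sqrt(|log delta|))."""
    signal_part = eps * math.sqrt(abs(math.log(eps)))
    operator_part = 2.0 ** (-j / 2.0) * delta * math.sqrt(abs(math.log(delta)))
    return tau * 2.0 ** (j * nu) * max(signal_part, operator_part)


def hard_threshold(coeffs, level):
    """Keep a coefficient only if its magnitude exceeds the level."""
    return [c if abs(c) > level else 0.0 for c in coeffs]


# Assumed values, for illustration only.
delta, eps, nu, tau = 3e-3, 1e-4, 1.0, 0.9
levels = [threshold_level(j, delta, eps, nu, tau) for j in range(6)]
kept = hard_threshold([0.5, -2.0, 1.0], level=1.0)
```

With these values the operator part dominates at low levels, and its relative weight decays like $2^{-j/2}$.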

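Proof of Lemma 4.3. All inequalities can be derived from the study of 
$\mathbb{P}(|\hat\beta_{j,\eta} - \beta_{j,\eta}| > t)$ in each case. Resorting to the identity 

$$(K^l_\delta)^{-1}(K^l f^l + \varepsilon W^l) - f^l = -\delta (K^l_\delta)^{-1} B^l f^l + (K^l_\delta)^{-1}\varepsilon W^l \qquad (4.4)$$

which holds for every $l \in \mathbb{N}$, and using Parseval's formula, we decompose 

$$\hat\beta_{j,\eta} - \beta_{j,\eta} = \sum_{l \in L_j} \Big[\big\langle (K^l_\delta)^{-1}(K^l f^l + \varepsilon W^l) - f^l, \psi^l_{j,\eta}\big\rangle 1_{\{A_l\}} - \big\langle f^l, \psi^l_{j,\eta}\big\rangle 1_{\{A_l^c\}}\Big]$$

$$= -\sum_{l \in L_j} \big\langle \delta (K^l_\delta)^{-1} 1_{\{A_l\}} B^l f^l, \psi^l_{j,\eta}\big\rangle + \sum_{l \in L_j} \big\langle (K^l_\delta)^{-1} 1_{\{A_l\}} \varepsilon W^l, \psi^l_{j,\eta}\big\rangle - \sum_{l \in L_j} \big\langle f^l, \psi^l_{j,\eta}\big\rangle 1_{\{A_l^c\}}$$

$$= I + II + III.$$

So we have to study the deviation bounds of these three terms. Term I 
can be decomposed as 

$$I = -\sum_{l \in L_j} \big\langle \delta (K^l_\delta)^{-1} B^l f^l, \psi^l_{j,\eta}\big\rangle 1_{\{A_l\}} \big(1_{\{\|\delta B^l\|_{op} \leq a_l\}} + 1_{\{\|\delta B^l\|_{op} > a_l\}}\big) = IV + V.$$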

In order to treat term IV, we introduce the operator 

defined for $j < J$. Since $K$ and $B$ are both stable on every space $\mathbb{H}_l$, and 
since $\langle \psi_{j,\eta}, \psi_{h,\alpha}\rangle = 0$ if $|j - h| > 1$, 

=1 E ^Q^h^^j^if^h 



h=j — + l 



<( ^ i(5Q^^,^ ij7) )r') e kwjd^w) 

h=j-i,3',3'+i h=i-i,i.f+l 

+ ( e iWi^,«»^)D'( E K/.^.jr')^ 1 ^} 



h=j — 1,3,3+1 h=j — 1,3,3+1 



where we used Hölder's inequality with $\frac{1}{\pi} + \frac{1}{\pi'} = 1$. Now, if $\pi \leq 2$, then 
$\pi' \geq 2$ and (2.5) together with Proposition 2.4 entail 

( E KSQj^^X)^ <( E IWi^, Q '^>l 2 )" 

fe=j — 1,3,3+1 h=j — 1,3,3+1 



< 52 j(l/+1/2) 



Moreover, since $f \in B^s_{\pi,r}$, we have 



h=j— 1,3,3+1 

aez h 



If 7T > 2, a similar argument added with the Besov embedding r 6 

s-2(-- A) 

B^i r 71 *" leads to the same bounds. Finally, 

mm >t)<¥ (iiJ r Q J ii op 2-^ s -i +i ) > t) 

< p(2-j/2||p l ..Bp i .|| op > ^-l 2 i(.-l/2-( S -2/ 7 r)) ) 

<exp(- Cot ' 22 ' )l f ., ^ (4.5) 

where we noted Pi,, the orthogonal projector onto ^^Hz and used 

Lemma 4.1, Lemma 4.2 together with the fact that s > ^. Turning to V, 
a direct application of Lemma 4. 1 entails 

P(||<5s'|| op > of) < 5 c ^ 2{21+1 ^ k2 (4.6) 
So we have 

F(\V\ >t)< F(5 £ IK^^II^II/illll^n^fA,} 1 /.^., >0 I > *) 
lei,- l ° p ai > 

<J2n\\(K^B\ op i {Ai} i { , } >t) 

< ^F((2Z + l)- 1 /2||^|| op >tKlog l/2 5) l/2 JC „p2 (2;+1) 2 K2/2 

< £ exp( _ Co(2/ + l)2 t 2 K 2 log 5/2) 

< ^2^/2 exp(-c 2 2 ^ 2 K 2 log 5/2) 
Turning to term II, we decompose in a similar way 

$$II = VI + VII.$$

Conditioning on $(B^l)_{l \in L_j}$ and applying Lemma 4.2, we derive, for 
all $t > 0$, 

nwi\ >t)=p(\ ^( e (Ki)- i ^,4)i {Ai} i, || . 1|k ,|>t) 

< «P ( " 2^7) (4-7) 
As for VII, employing the Cauchy-Schwarz inequality, (4.6), and conditioning 
on $(B^l)_{l \in L_j}$, we write 



n\vn\ >t) = p(i ]r (^i)- 1 ^,^)!^}^ lli3% >oi) i > t) 
<E p (i< e (^)- 1 ^ , .^> 1 w 1 w u>ai} i>*) 

/ f= T. ■ V J 



< exp( _^ 2 l lo fl 23 )^ 2 ^v 2 



4e 2 

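It remains to treat term III. We claim that 

$$1_{\{A_l^c\}} \leq 1_{\{\|\delta B^l\|_{op} > O_{l,\delta}\}} + 1_{\{\|(K^l)^{-1}\|_{op} > O_{l,\delta}^{-1}/2\}} \qquad (4.8)$$

(for a proof, we refer to Delattre et al. [6]). Hence, 

$$|III| \leq \Big|\sum_{l \in L_j} \langle f^l, \psi^l_{j,\eta}\rangle 1_{\{\|\delta B^l\|_{op} > O_{l,\delta}\}}\Big| + \Big|\sum_{l \in L_j} \langle f^l, \psi^l_{j,\eta}\rangle 1_{\{\|(K^l)^{-1}\|_{op} > O_{l,\delta}^{-1}/2\}}\Big| = VIII + IX.$$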

As 

{UK'rXp > orj/2} c{i> c (5^bgif)-^} 

for a constant c depending only on k and Q2> we derive noting 
j = [c(5^/\\og5\y^ 7 + T ^\ + 1 so that for all j < j , I £ Lj 

1 {\\{K l )^\\ op >0-j/2} = °' 

n\m>t)<i {t<l ^ l} i {j > j0} (4.9) 

Now, a quick application of Lemma 4.1 entails 

¥(\\8B l \\ > O hS ) < 5 cok2( - 21+ V 2 
Hence, 

p(\vin\ >t)<F(\ T,(f l ^n hs ^ l>0is] \ > t) 

^E P ( 1 W||>o !5 } > *) 

^E^wii^l 1 ^ 1 

<£p(||^||>0 M ) 1/2 l {t < 1} 

~ {*<!} 

Inequality (4.1) results directly from the previous deviation inequalities. Inequalities 
(4.2) and (4.3) are both applications of the well-known formula 

$$\mathbb{E}[|X|^p] = \int_{u>0} p u^{p-1}\, \mathbb{P}(|X| > u)\, du \leq p \int_{u>0} u^{p-1} \big(1 \wedge \mathbb{P}(|X| > u)\big)\, du.$$

Indeed, noticing that, if one takes $\kappa$ and $\tau$ large enough, the leading 
terms in their studies are given by (4.5), (4.7) and (4.9), inequality (4.2) follows 
immediately. As for inequality (4.3), we have 

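$$\mathbb{E}\big[\sup_{\eta \in Z_j} |\hat\beta_{j,\eta} - \beta_{j,\eta}|^p\big] \leq \int_{u>0} p u^{p-1} \Big(1 \wedge \mathbb{P}\big(\sup_{\eta \in Z_j} |\hat\beta_{j,\eta} - \beta_{j,\eta}| > u\big)\Big)\, du$$

$$\lesssim p \int_{u>0} u^{p-1} \Big(1 \wedge 2^{2j}\, \mathbb{P}\big(|\hat\beta_{j,\eta} - \beta_{j,\eta}| > u\big)\Big)\, du.$$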

Moreover, considering only the terms 4.5, 4.7 and 4.9, we have 



which entails 4.3. □ 
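The tail-integration formula $\mathbb{E}[|X|^p] = \int_{u>0} p u^{p-1}\mathbb{P}(|X|>u)\,du$ used for (4.2) and (4.3) can be checked numerically. The sketch below does so for a standard Gaussian variable, which is an illustrative choice and not the distribution of the coefficients studied in the proof; the integration bounds and the number of nodes are arbitrary.

```python
import math


def abs_moment_via_tail(p, upper=12.0, n=200_000):
    """Midpoint-rule approximation of int_0^upper p*u^(p-1)*P(|X| > u) du
    for X ~ N(0, 1), using P(|X| > u) = erfc(u / sqrt(2))."""
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += p * u ** (p - 1) * math.erfc(u / math.sqrt(2.0))
    return total * h


def abs_moment_exact(p):
    """Closed form E|X|^p = 2^(p/2) * Gamma((p+1)/2) / sqrt(pi) for X ~ N(0, 1)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


# Compare both expressions for a few exponents p >= 1.
gaps = {p: abs(abs_moment_via_tail(p) - abs_moment_exact(p)) for p in (1, 2, 3)}
```

The same identity, applied to the supremum over $\eta \in Z_j$ together with a union bound over the roughly $2^{2j}$ indices, is what produces the factor $2^{2j}$ inside the minimum above.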



4.1 Proof of Theorem 3.2 

Proof. We shall only investigate the case where $p \geq \pi$, since for $p \leq \pi$ 
we have $B^s_{\pi,r} \subset B^s_{p,r}$. The $L^p$ loss of the procedure can be decomposed as 
follows: 

e ii? - /us < e || E Cf - f, ^,v)^J p P + ii E E /^JS 
Since $f \in B^s_{\pi,r}$, the second term is bounded by 

$$2^{-Jp\left(s - 2\left(\frac{1}{\pi} - \frac{1}{p}\right)\right)}.$$
S _2(I-I) 

It is not difficult to show that — x ^_ l p is always larger than fi(x) (see 

[16]). Hence M") < < s-2(±-|) and //(*/- 1/2) < < 

2(s — 2(i — -)). To bound the first term, we apply Holder's inequality 
and 2.5, to write: 



E|lE£</-^>;,J 



ip 



+EE E 0^i p i {l ^i< Si(5l£)} ]ii^n^ 

j<J T]<^Z 3 

The first step is to replace $S_j(\delta,\varepsilon)$ by a quantity explicitly depending 
on $2^{j\nu}$, namely $\overline{S}_j(\delta,\varepsilon)$. Hence write 



S =^E E E [fen ~ t 3 3Sl\ n {\p j , 1 \>S J (S,e)} 1 {h< + ^} 

( 1 {lHA , i||<«, J } +1 {||«A l i||>, Jj })Jll^Hp 

<^(E E e [i^-^i pi {i^„i> W }] itoJS 
£ E E [i3 3 -,, - 0;j*] P/a ^ ( ^ +1)V/ i^gig) 



j<J r/eZj 

where we applied Lemma 4.2, (4.6) and the Cauchy-Schwarz inequality. It is 
clear that the second term is negligible for $\kappa$ large enough. In a similar 
way, 

g =JP -1 [^3-n\ Pl {\P ] J<S ] (S,e)}( 1 {h< + ^} + 1 {l ] =+co}) 

j<J T]&Zj 

( 1 {||^||< % .} +1 {||^||> % .})J II ^ II P 

< Jp -'E E ( E [i^i P1 {^i<5- ( , £ )}] + i^i p p(n^n >%) 

j<J rjeZj 

Moreover, thanks to 4.8, 

1{ ' J=+0 ° } - ~ '{ll^H^,} +1 {ll(^)-|| 0P >0 2 -V2} 

It is clear (see the treatment of term III and (4.6)) that the dominating 
term is 

Hence 

E ii E E (f - ^,v)^Jp z JP ~' {i in + iv) 

with 



Bb- E [\Pj,n / 3 J,.;| Pl {|3 ji) |>s7(5, e) } 1 {| / 3 J , rJ |>57(5, e )/2}]ll^llp 

j<J,r-)£Zj 

Bs= 22 E [\p jtV - pjJPlrp l^i.e)} 1 /^ rJ |<s7(5,s)/2}] H^J.Jp 

5& = E l^>| PE [ 1 {|^ IJ |<5-(5,e)} 1 {|/3„|>2S-(5, e )}]H^llp 
j<J,V£Zj 

SS= J! \Pj,v\ PE [ 1 {|3 J , 7J |<57(5,e)} 1 {|/3 J ,„|<2S7(5, e )}]H^llp 

j<j,n^Zj 

We can now treat the terms Bs, Bb, Sb and Ss, applying (2.5), (4.1) and 
the Cauchy-Schwarz inequality: 

< J p - 1 E E E [i^, - ^ ,i>^ e)/a} ] ii^ii; 

3<Jr,&Zj 

£ JP_1 E E (^T V (<^- 1/2) ) p V 1/3^1^)2^ V ^ 2 ) 

j<Jnez 3 
Moreover, 



Sb< ,P~ l Y, E ^iJ^hn-PJ > Sj^s))]]^ 



lip 

J,V"P 

j<j veZj 



JP- 1 (e t V(T ) 

since / S Bp^ 1 1 < p \ Hence in both cases the rate of convergence is 
smaller than what is claimed for sufficiently large r. Turning to Bb and 
Ss, we write, for all z,z' > 0, 

■j<Jn£Zj 

< ^ ^ ((err v (^-^f v l^/i^^i^g^^ll^ir 

j<J r/eZj 

j<J riGZj 

+ JP- 1 ( ( 5^b^") p - z '^2^^- 1 / 2 )^'^- 2 ] \^,n\ Z ' 

j<J r/eZj 

+ i7 P-l 2 -3op(--2(i-i)) 



and 

<^ P_1 E E 1/9, n |* ( 1 r / 1 



{|/3 J ,, ; |<2r2J("-i/2) (5% /|1^5|] 

+ j^ i (5^bg^) p - z 'E 2J[( ^ 1/2)(p - 2 ' )+p - 2] E 

We already bounded 2 ■ ?op v ,s 2 (~ p'J ~ <5 P "+1/2 ; so in both cases 
we have the same term to bound. This term can be further written as 
$R(\varepsilon, \nu, z) + R(\delta, \nu - 1/2, z')$, where 

^,y, Z ) = j^- i (x^ioi^i)^E 2i[2/(p " )+ ^ 21 E ww^iAi 

We only give a brief overview of the treatment of $R$; a detailed one can be 
found in Kerkyacharian et al. [16]. First, we split $R$ as follows 
R(x,y,z) = ^[{x^l^iy- 21 E \Pj,v\ Z1 W^Jp 

j<J r/GZj 

+ (x^T^iy-^ e 2 j[y(p - z * )+p - 2] E WW^ji 

j>Jo V^Zj 

where z\,z% % Jo are to determine. Consider first the case where s > (y + 
l)(f - 1). Note q = Pjgzr Taking z 2 = tt, *i = <f < g and 2 J °? ~ 



(a:-\/| log a; |) 1 entail 

R(x,y,J Q ) < (logxf-^xy^^y-i 



which is the desired bound. Now consider the case where s < (y+l)(^ — 1) 
and note q = p ^rpp^zfep - Take z\ = tt, z 2 = q > q and 2 Jo i^ 1 " 1-2 ^ ~ 
(xy/\ log a^l) -1 . We obtain 

which ends the proof. □ 



4.2 Proof of Theorem 3.3 

Proof. Write similarly 
II? - /Hoc < E || E E (Pi,n ~ ^felU + || £ £ 

j<J rjeZj j>J T]£Zj 

The second term can be handled as before. We decompose the first term 
in the following way, using (2.5) for $p = \infty$ and following the same line 
of proof as in Theorem 3.2, 

E 11 E E (hv - ^>M,joc < E E su p \hn - 

<Bb + Bs + Sb + Ss 
with 

Bb = £ 2* E [sup |3 M - / 3 ^l 1 { |3^|>^ )£ )} 1 {| /3 , J >^ )£ )/2}] 



5a = E 2 J E [sup 13^ - ^,,|l { |3^| > ^ (5 , £)} l { | /3 ^|<^ )£)/2} ] 



Sb = E 2i ™v 1^1 E [ 1 {i/3 J „ ; i<5-( 5 , e )} 1 {i/ 3j g>25-(,, £ )}] 



We have, using inequality (4.3), 



Bb < J2 2*E sup |/3 Jf „ - /3^1 {| ^ >w/2} 

^E^ 1 {^,|/ J ,.,|>^e) /a }2 , B^l3i,,-^l 

< 

< 2 J ^ +1 \j 1 + l)e + 2 /l ^+ 3 / 2 ' (Ji + 1)5 + X] 2*'|/3 



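where $J_1$ is chosen so that, for $j > J_1$, $|\beta_{j,\eta}| \leq \tau\varepsilon\sqrt{|\log\varepsilon|}\,2^{j\nu}/2$. We can 
take for example (see [16]) $J_1$ satisfying, for a certain constant $B$, 

$$2^{J_1} = B\big(\varepsilon\sqrt{|\log\varepsilon|}\big)^{-(s+\nu+1-2/\pi)^{-1}}.$$

Similarly, taking 

$$2^{I_1} = C\big(\delta\sqrt{|\log\delta|}\big)^{-(s+\nu+1/2-2/\pi)^{-1}}$$

for a certain constant $C$ implies $|\beta_{j,\eta}| \leq \tau\delta\sqrt{|\log\delta|}\,2^{j(\nu-1/2)}/2$ for all 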

j < I\. The term 2 J 1/3^1 is easily treated. This finally leads to the 

j>jo 

rate 

56 < |loge|e^' (2) V |log5|^' (1) 

<[E2^VT^|2^ + E2 i |/3J 

j<Jl i>Jl 

V [ ^ 2^^1oi^2^^ 1 / 2 ) + ]T 2^, 
j</i i>/i 
which gives the proper rate of convergence. Turning to Bs and Sb, we 
write, using inequalities (4.1) and (4.3), 



Bs < g 2^E [sup - ^, ;|> ^, £)/2} ] 



< £ 2* E[sup \P jin - ^fl^PCar? G Z,-, - /3 3 -,| > Stfrfffl 1 ' 2 

< 2 J [(J + 1) V 52^-^) V |/3, ,|1 {j -> Jo} ] ^'(^ V ^ 2 )] 1/3 
Now apply inequality 4.1 and the fact that J < 2~ J to derive 



<^2 2 ^P(|3,,-/3,,|>^, £ )) 
j<J 

It is clear that, for a well-chosen $\tau$, these terms are smaller than the 
announced rates. 

□ 